U.S. Serial No. 09/817,005 Atty. Docket No. 20103/A00386-2 

Amendment Pursuant to 37 C.F.R. § 1.312 

Amendments to the Claims 

1. (Previously Presented) A speech reference enrollment method, comprising: 

receiving a first utterance of a word; 

extracting a plurality of features from the first utterance; 

receiving a second utterance of the word; 

extracting the plurality of features from the second utterance; 

determining a first similarity between the plurality of features from the first 
utterance and the plurality of features from the second utterance; 

when the first similarity is less than a predetermined similarity, requesting a 
user to speak a third utterance of the word; 

extracting the plurality of features from the third utterance; 

determining a second similarity between the plurality of features from the first 
utterance and the plurality of features from the third utterance; and 

when the second similarity is greater than or equal to the predetermined 
similarity, forming a reference for the word.

2. (Previously Presented) The method of claim 1, further comprising:
when the second similarity is less than the predetermined similarity, 

determining a third similarity between the plurality of features from the second 
utterance and the plurality of features from the third utterance; and 

when the third similarity is greater than or equal to the predetermined 
similarity, forming the reference for the word. 

3. (Previously Presented) The method of claim 2, further comprising when the 
third similarity is less than the predetermined similarity, receiving another first utterance of 
the word. 
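
For illustration only, the decision flow recited in claims 1 through 3 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `cosine_similarity` and `select_enrollment_pair` are hypothetical, cosine similarity is an assumed stand-in for whatever similarity measure is actually used, and accepting the first and second utterances when they already match is an assumption that the claims do not recite. Forming the reference from the returned pair is left out because the claims do not say how it is done.

```python
from typing import Callable, Optional, Sequence, Tuple

Features = Sequence[float]


def cosine_similarity(a: Features, b: Features) -> float:
    """Assumed similarity measure; the claims do not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_enrollment_pair(
    first: Features,
    second: Features,
    request_third: Callable[[], Features],
    threshold: float,
    similarity: Callable[[Features, Features], float] = cosine_similarity,
) -> Optional[Tuple[Features, Features]]:
    """Return the pair of feature sets a reference could be formed from.

    Returns None when no pair is similar enough, in which case the caller
    would start over with another first utterance (claim 3).
    """
    # Assumption: a matching first/second pair is accepted directly.
    if similarity(first, second) >= threshold:
        return first, second
    # Claim 1: first similarity too low, so ask for a third utterance.
    third = request_third()
    if similarity(first, third) >= threshold:
        return first, third
    # Claim 2: fall back to comparing the second and third utterances.
    if similarity(second, third) >= threshold:
        return second, third
    return None
```

A caller would pass feature vectors for the first two utterances and a callback that prompts the user and returns features for the third utterance.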

4. (Previously Presented) The method of claim 1, further comprising: 
determining a duration of the second utterance; and 

when the duration is less than a minimum duration, disregarding the second 
utterance. 

5. (Previously Presented) The method of claim 1, further comprising: 
determining a duration of the second utterance; and 

when the duration is greater than a maximum duration, disregarding the 
second utterance.

6. (Previously Presented) The method of claim 5, further comprising:
setting an amplitude threshold; 

determining a start time when an input signal exceeds the amplitude threshold; 
determining an end time, after the start time, when the input signal is less than 
the amplitude threshold; and 

calculating the duration as a difference between the end time and the start
time.
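
For illustration only, the duration measurement of claim 6 and the duration checks of claims 4 and 5 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the use of the absolute sample value as the amplitude, and the sample-rate parameter are all hypothetical and are not taken from the claims.

```python
from typing import Optional, Sequence


def utterance_duration(
    samples: Sequence[float],
    sample_rate_hz: float,
    amplitude_threshold: float,
) -> Optional[float]:
    """Duration in seconds between the first threshold crossing and the
    first later sample that falls below the threshold.

    Returns None if the signal never exceeds the threshold or never
    falls back below it.
    """
    start_index = None
    for index, sample in enumerate(samples):
        amplitude = abs(sample)  # assumption: amplitude is |sample|
        if start_index is None:
            if amplitude > amplitude_threshold:
                start_index = index  # start time
        elif amplitude < amplitude_threshold:
            # end time minus start time, converted to seconds
            return (index - start_index) / sample_rate_hz
    return None


def duration_is_acceptable(
    duration_s: float, min_duration_s: float, max_duration_s: float
) -> bool:
    """False when the utterance would be disregarded (claims 4 and 5)."""
    return min_duration_s <= duration_s <= max_duration_s
```

A practical detector would likely smooth the signal before thresholding so that a single quiet sample inside a word does not end the utterance; that refinement is omitted here.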

7. (Currently Amended) The method of claim 1, further comprising: 
determining an estimate of a number of voiced speech frames; and

when the estimate of the number of voiced speech frames is less than a
threshold, requesting the user repeat the word.



8. (Previously Presented) The method of claim 1, further comprising: 
determining a signal to noise ratio of the first utterance; and 

when the signal to noise ratio is less than a predetermined signal to noise ratio, 
increasing a gain of a voice amplifier. 

9. (Previously Presented) The method of claim 8, further comprising requesting 
the user repeat the word. 

10. (Previously Presented) The method of claim 1, further comprising determining 
an amplitude histogram of the first utterance.

11. (Previously Presented) A speech reference enrollment method, comprising:
requesting a user speak a word; 

detecting a first utterance; 
requesting the user speak the word; 
detecting a second utterance; 

determining a first similarity between the first utterance and the second 
utterance; 

when the first similarity is less than a predetermined similarity, requesting the 
user speak the word; 

detecting a third utterance; 

determining a second similarity between the first utterance and the third 
utterance; and 

when the second similarity is greater than or equal to the predetermined 
similarity, creating a reference. 

12. (Previously Presented) The method of claim 11, further comprising: 
determining a third similarity between the second utterance and the third
utterance; and

when the third similarity is greater than or equal to the predetermined 
similarity, creating the reference.

13. (Previously Presented) The method of claim 12, further comprising when the
third similarity is less than the predetermined similarity, requesting the user re-speak the 
word. 

14. (Previously Presented) The method of claim 11, further comprising: 
determining if the first utterance exceeds an amplitude threshold within a
timeout period; and

when the first utterance does not exceed the amplitude threshold within the 
timeout period, requesting the user re-speak the word. 

15. (Previously Presented) The method of claim 11, further comprising: 
determining an estimate of a number of voiced speech frames; and 

when the number of voiced speech frames is less than a predetermined number 
of voiced speech frames, requesting the user re-speak the word. 

16. (Previously Presented) The method of claim 11, further comprising: 
determining a duration of the first utterance; 

when the duration is less than a minimum duration, requesting the user
re-speak the word; and

when the duration is greater than a maximum duration, requesting the user
re-speak the word.

17. (Previously Presented) A computer readable storage medium containing
computer readable instructions that, when executed by a computer, cause the computer to: 
request a user speak a word; 
receive a first digitized utterance; 

extract a plurality of features from the first digitized utterance; 

request the user speak the word; 

receive a second digitized utterance of the word; 

extract the plurality of features from the second digitized utterance; 

determine a first similarity between the plurality of features from the first 
digitized utterance and the plurality of features from the second digitized utterance; 

when the first similarity is less than a predetermined similarity, request the 
user to speak a third utterance of the word; 

extract the plurality of features from a third digitized utterance; 

determine a second similarity between the plurality of features from the first 
digitized utterance and the plurality of features from the third digitized utterance; and 

when the second similarity is greater than or equal to the predetermined 
similarity, form a reference for the word.

18. (Previously Presented) The computer readable storage medium of claim 17
containing computer readable instructions that, when executed by the computer, cause the 
computer to: 

when the second similarity is less than the predetermined similarity, determine 
a third similarity between the plurality of features from the second digitized utterance 
and the plurality of features from the third digitized utterance; and 

when the third similarity is greater than or equal to the predetermined 
similarity, form the reference for the word. 



19. (Currently Amended) The computer readable storage medium of claim 18 
containing computer readable instructions that, when executed by the computer, cause the 
computer to: 

when the third similarity is less than the predetermined similarity,
request the user re-speak the word.



20. (Previously Presented) The computer readable storage medium of claim 17 
containing computer readable instructions that, when executed by the computer, cause the 
computer to: 

determine a signal to noise ratio; and 

when the signal to noise ratio is less than a predetermined signal to noise ratio, 
request the user re-speak the word.

21. (Previously Presented) The computer readable storage medium of claim 20
containing computer readable instructions that, when executed by the computer, cause the 
computer to increase a gain of an amplifier when the signal to noise ratio is less than the
predetermined signal to noise ratio. 

22. (Previously Presented) The computer readable storage medium of claim 17 
containing computer readable instructions that, when executed by the computer, cause the 
computer to: 

determine if an amplifier gain is saturated; and 

when the amplifier gain is saturated, request the user re-speak the word. 

23. (Previously Presented) A speech reference enrollment method, comprising: 
receiving a first utterance of a word; 

extracting a plurality of features from the first utterance; 
determining a signal to noise ratio of the first utterance; 

when the signal to noise ratio is less than a predetermined signal to noise ratio, 
increasing a gain of a voice amplifier; 

receiving a second utterance of the word; and 

extracting the plurality of features from the second utterance.

24. (Previously Presented) The method of claim 23, further comprising:
determining a first similarity between the plurality of features from the first
utterance and the plurality of features from the second utterance;

when the first similarity is less than a predetermined similarity, requesting a 
user to speak a third utterance of the word; 

extracting the plurality of features from the third utterance; 

determining a second similarity between the plurality of features from the first 
utterance and the plurality of features from the third utterance; and 

when the second similarity is greater than or equal to the predetermined 
similarity, forming a reference for the word. 

25. (Previously Presented) The method of claim 24, further comprising: 
when the second similarity is less than the predetermined similarity, 

determining a third similarity between the plurality of features from the second 
utterance and the plurality of features from the third utterance; and 

when the third similarity is greater than or equal to the predetermined 
similarity, forming the reference for the word.

26. (Previously Presented) The method of claim 23, further comprising:
determining a signal to noise ratio of the second utterance; and 

when the signal to noise ratio is less than a predetermined signal to noise ratio, 
increasing the gain of the voice amplifier and receiving a third utterance of the word. 
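
For illustration only, the signal to noise check and gain increase recited in claims 23 and 26 can be sketched as follows. This is a minimal sketch under stated assumptions: the claims do not say how the signal to noise ratio is measured or by how much the gain is increased, so the power-ratio estimate in decibels, the multiplicative `step`, the `max_gain` cap, and both function names are hypothetical.

```python
import math
from typing import Sequence


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """Assumed SNR estimate: ratio of mean speech power to mean noise
    power, in decibels. Both sequences must be non-empty."""
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        return math.inf
    if speech_power == 0.0:
        return -math.inf
    return 10.0 * math.log10(speech_power / noise_power)


def next_gain(
    current_gain: float,
    measured_snr_db: float,
    min_snr_db: float,
    step: float = 1.5,       # assumed multiplicative increase
    max_gain: float = 16.0,  # assumed upper limit on the amplifier gain
) -> float:
    """Increase the gain when the SNR is below the predetermined value;
    otherwise leave it unchanged."""
    if measured_snr_db < min_snr_db:
        return min(current_gain * step, max_gain)
    return current_gain
```

After the gain is raised, the caller would receive the next utterance of the word at the new gain setting.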

27. (Canceled) 

28. (Previously Presented) The system of claim 30, further comprising a feature 
extractor connected to an output of the adjustable gain amplifier, wherein the feature 
extractor forms an amplitude histogram. 

29. (Currently Amended) The system of claim 30, further comprising a signal to 
noise comparator having a first input connected to a signal to noise meter, a second
input connected to a threshold, and an output of the signal to noise comparator is connected 
to a gain input of the adjustable gain amplifier.

30. (Previously Presented) A speech recognition system, comprising:

an amplitude threshold detector connected to an input speech signal; 

an adjustable gain amplifier connected to the input speech signal; 

an amplitude comparator to compare an output of the adjustable gain amplifier 
to a saturation threshold; and 

a feature comparator connected to an output of a feature extractor, wherein a 
gain input of the adjustable gain amplifier can be adjusted both up and down during 
receipt of the input speech signal. 



31. (Original) The system of claim 30, further including a timer connected to an 
output of the amplitude threshold detector. 



32. (Original) The system of claim 30, wherein the feature extractor forms an 
amplitude histogram. 